Mandarin speech prosody: issues, pitfalls and directions
Abstract
From the perspective of speech technology development for unlimited Mandarin Chinese TTS, two issues appear to be the greatest impediments: (1) how to predict prosody from text, and (2) how to achieve better naturalness in speech output. These impediments bring out the major pitfalls in related research, i.e., the characteristics of Chinese connected speech and the overall rhythmic structure of speech flow. This paper discusses where the problems stem from and how some solutions could be found. We propose that prosody research for Mandarin needs to include the following: (1) characteristics of Mandarin connected speech that constitute the prosodic properties of speech flow, i.e., units and boundaries; (2) the scope and type of speech data collected, i.e., text other than isolated sentences; (3) prosody in relation to speech planning, i.e., information other than lexical, syntactic and semantic; and (4) an overall organization of prosody for speech flow, i.e., a framework that accommodates the above-mentioned features.
Similar resources
Issues in Chinese Prosody : Conceptual Foundations of a Linguistically-Motivated Text-to-Speech System for Mandarin
I examine various controversial aspects of Chinese prosody–tone structure, syllable structure, stress, and intonation–and stress the need to view all of these as interacting systems, aspects of a hierarchical prosodic structure. I examine various proposals at these various levels of the hierarchy and suggest which are most appropriate. Specifically, I suggest the adoption of Bao's version of sy...
Perceptual relevance of pitch contours of Mandarin tones and its efficacy in prosody generation of speech synthesis
Modeling Mandarin tones is one of the most important issues in speech synthesis. However, established knowledge is mainly focused on the “production” aspect. In this paper, we first characterized relative pitch levels of tones. Next, two perceptual experiments were designed to investigate “perceptual” relevance of pitch levels and shapes in Mandarin. Results showed that relative pitch levels of...
Speech Rate and Prosody Units: Evidence of Interaction from Mandarin Chinese
This paper discusses evidence of interaction found between speech rate and prosody units in Mandarin Chinese speech. Mandarin speech data of 2 different speech rates that had been previously labeled for perceived boundaries and prosody units were further analyzed for duration patterns at each prosodic level. Each prosody level demonstrated patterns of duration adjustment for both speech rates t...
An automatic prosody labeling method for Mandarin speech
A new model-based automatic prosody labeling method for Mandarin speech is proposed. It first introduces four models to describe the relationships of the prosody tags to be labeled, the prosodic features of the speech signals, and the linguistic features of the associated texts. It then employs a sequential optimization procedure to estimate parameters of these four models and find all prosody ...
Unsupervised prosody labeling for constructing Mandarin TTS
This paper introduces an unsupervised prosody labeling method for preparing a large speech corpus used in developing a Mandarin Text-to-Speech system. Adopting a four-layer prosody hierarchy, the proposed method performs an unsupervised segmental clustering that iteratively segments spoken utterances into strings of prosodic constituents and models the patterns of the segmented prosodic constit...
Publication date: 2003